[SYCL] [AMDGPU] Ignore incorrect sub-group size #11687

jchlanda · 2023-10-27T09:20:37Z

CDNA GPUs support only a wavefront size of 64; for those GPUs, allow a sub-group size of 64. Some GPUs support both 32 and 64; for those (and the rest), only allow 32.
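The rule in the description can be modelled in a few lines. This is a minimal sketch under stated assumptions, not the clang implementation: `AmdArch`, `acceptedSubGroupSize`, and `wouldWarn` are hypothetical names introduced here, and how the patch actually identifies a CDNA target is not shown.

```cpp
// Hypothetical classification of the target GPU (not a clang type).
enum class AmdArch { CDNA, Other };

// Sub-group size accepted for each kind of GPU, per the description above.
constexpr unsigned acceptedSubGroupSize(AmdArch arch) {
  return arch == AmdArch::CDNA ? 64u : 32u;
}

// True when a requested size differs from the accepted one and would
// therefore be diagnosed.
constexpr bool wouldWarn(AmdArch arch, unsigned requested) {
  return requested != acceptedSubGroupSize(arch);
}

// Compile-time checks of the rule.
static_assert(!wouldWarn(AmdArch::CDNA, 64), "CDNA accepts 64");
static_assert(wouldWarn(AmdArch::CDNA, 32), "CDNA rejects 32");
static_assert(!wouldWarn(AmdArch::Other, 32), "other GPUs accept 32");
static_assert(wouldWarn(AmdArch::Other, 64), "other GPUs reject 64");
```

Note that under this rule a GPU that supports both 32 and 64 is still treated as a 32-only target.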

jchlanda · 2023-11-01T09:59:42Z

Friendly ping @intel/dpcpp-cfe-reviewers

jchlanda · 2023-11-08T07:16:11Z

Added a test to make sure that the values are indeed ignored: 0a72b7b

smanna12

LGTM

againull · 2023-11-08T22:57:13Z

Test failure:
SYCL :: OptionalKernelFeatures/is_compatible/is_compatible_with_aspects.cpp
seems to be related to the patch.

jchlanda · 2023-11-09T14:22:30Z

That's right, this patch changes the handling of an incorrect sub-group size attribute, such that it doesn't end up in the resulting binary image.
The way this test works is that it creates a kernel with a known, incorrect required sub-group size (INT_MAX). It then expects that kernel to be incompatible with the device, because it fails a check against the supported sub-group sizes.
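To see why dropping the attribute breaks the test, here is a minimal sketch of the kind of check it relies on. `isCompatible`, `exampleScenario`, and the device data are hypothetical stand-ins introduced for illustration; they are not the SYCL runtime's API.

```cpp
#include <algorithm>
#include <cassert>
#include <climits>
#include <optional>
#include <vector>

// Hypothetical stand-in for the runtime's compatibility check.
// requiredSize is empty when no sub-group size was recorded in the image.
bool isCompatible(const std::optional<unsigned>& requiredSize,
                  const std::vector<unsigned>& supportedSizes) {
  if (!requiredSize)
    return true; // nothing to check against
  return std::find(supportedSizes.begin(), supportedSizes.end(),
                   *requiredSize) != supportedSizes.end();
}

// Mirrors the scenario described above with made-up device data.
void exampleScenario() {
  const std::vector<unsigned> supported = {32u, 64u}; // assumed device sizes
  const unsigned bogus = static_cast<unsigned>(INT_MAX);

  // Attribute kept: the kernel is reported as incompatible, as the test expects.
  assert(!isCompatible(bogus, supported));
  // Attribute dropped by the compiler: the same kernel now looks compatible.
  assert(isCompatible(std::nullopt, supported));
}
```

With the attribute removed before the image is built, there is nothing left for the check to reject, so the test's expectation of incompatibility no longer holds.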

There are two ways of thinking about that issue:

since we warn about and ignore the incorrect sub-group size value (https://github.com/intel/llvm/pull/11687/files#diff-2a5bdb2d9f07f8d77de51d5403d349c22978141b6de6bd87fc5e449f5ed95becR4027), this test is no longer applicable,
or we shouldn't ignore the incorrect values and should only warn.

I'm not sure which is the correct way of handling those values.
Would you be able to advise @againull @elizabethandrews @smanna12 ?

elizabethandrews · 2023-11-09T21:02:22Z

I don't think it makes sense to allow an invalid sub-group size to pass through if we're emitting the warning. IMO we should either allow the incorrect value to pass through without a warning and have the backend deal with it however it currently does, or do what this PR does, i.e. emit a warning and drop the attribute. @AlexeySachkov can you weigh in here? Is there a reason we are passing through invalid sub-group sizes?

jchlanda · 2023-11-15T07:49:24Z

Any preference on this @AlexeySachkov ?

jchlanda · 2023-11-20T13:05:02Z

@AlexeySachkov friendly ping on this one please.

AlexeySachkov · 2023-11-20T16:55:34Z

From the SYCL spec point of view, there are no sub-group sizes that are known to be incorrect at compile time - what we have for AMD and CUDA are implementation details of those backends.

Explicitly requesting the sub-group size of a kernel has limited portability by design, but considering how fundamental and narrow the selection of possible sub-group sizes is for AMD/CUDA, it might make sense to promote that knowledge into the compiler in the form of the suggested warning to improve user experience.

I think that it may actually make sense to pass the invalid sizes through even if we diagnosed them with a warning: this way the runtime handling of the attribute for the AMD/CUDA backends would be uniform with other backends and users will get a more consistent experience. At the same time, the warning will let users know in advance that their application won't work and that it requires some code changes.
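The options raised in this thread differ only in whether a warning is emitted and whether the value survives into the device image. The sketch below models that distinction; `Policy`, `Outcome`, and `handleUnsupportedSize` are hypothetical names introduced for illustration and do not correspond to clang code.

```cpp
#include <cassert>
#include <optional>

// The three behaviours discussed in this thread (names are hypothetical).
enum class Policy { PassThroughSilently, WarnAndDrop, WarnAndPassThrough };

struct Outcome {
  bool warned;                         // did the compiler diagnose the value?
  std::optional<unsigned> emittedSize; // value left for the runtime, if any
};

// Models what happens to an unsupported sub-group size under each policy.
Outcome handleUnsupportedSize(Policy policy, unsigned requested) {
  switch (policy) {
  case Policy::PassThroughSilently:
    return {false, requested};
  case Policy::WarnAndDrop:
    return {true, std::nullopt};
  case Policy::WarnAndPassThrough:
    return {true, requested};
  }
  return {false, requested}; // not reached for valid enumerators
}

void examplePolicies() {
  const Outcome dropped = handleUnsupportedSize(Policy::WarnAndDrop, 16);
  assert(dropped.warned && !dropped.emittedSize);
  const Outcome kept = handleUnsupportedSize(Policy::WarnAndPassThrough, 16);
  assert(kept.warned && kept.emittedSize == 16u);
}
```

Only the policies that keep `emittedSize` leave the runtime something to check, which is why warn-and-pass-through keeps AMD/CUDA handling in line with the other backends.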

The only concern about the warning I have is that it may produce false alarms: what if there is a kernel which is never submitted to an AMD/CUDA device? The warning will still be there and it may be annoying for someone who uses -Werror. Once we get proper support for optional kernel features for AOT targets it may be even easier to discover that

jchlanda · 2023-11-21T13:12:18Z

Following your suggestion, I've let the invalid value pass through. I've removed a test, since no values are being ignored.

> The only concern about the warning I have is that it may produce false alarms: what if there is a kernel which is never submitted to an AMD/CUDA device? The warning will still be there and it may be annoying for someone who uses -Werror. Once we get proper support for optional kernel features for AOT targets it may be even easier to discover that

I feel that it's an edge case, that is outweighed by the benefit of informing the users about the incorrect size, and perhaps we could ignore it for now. If that ever becomes an issue, it would be simple enough to provide a mechanism to silence this particular warning?

AlexeySachkov · 2023-11-21T13:50:38Z

I think that we may encounter that sooner rather than later, but I do agree that the benefit of the warning probably outweighs the -Werror concern. I'm fine with the direction of this PR, but we should keep in mind that at some point we will likely need a knob to disable that particular warning.

al42and · 2023-11-21T13:58:48Z

> I feel that it's an edge case, that is outweighed by the benefit of informing the users about the incorrect size, and perhaps we could ignore it for now. If that ever becomes an issue, it would be simple enough to provide a mechanism to silence this particular warning?

As a person compiling with -Werror and for multiple sub-group sizes (toy example), I mostly agree there; being able to suppress the warning is enough.

It looks like the warning suppression currently required would be too broad. Originally (#6183), -Wno-cuda-compat was enough, which does not suppress too many relevant warnings. Now, -Wno-unknown-attributes would be needed, which can hide true positives.

#11991: As a follow-up to #11687, this PR adds a mechanism to silence the warning using a dedicated switch, `-Wno-incorrect-sub-group-size`, which is wrapped in the `-Wno-attribute` group.

jchlanda requested a review from a team as a code owner October 27, 2023 09:20

[SYCL] [AMDGPU] Ignore incorrect sub-group size

ac489a5

CDNA GPUs support only a wavefront size of 64; for those GPUs, set the sub-group size to 64. Some GPUs support both 32 and 64; for those (and the rest), only allow 32.

jchlanda force-pushed the jakub/amd_sub_group_warning branch from 8c22cd8 to ac489a5 Compare October 30, 2023 17:10

elizabethandrews reviewed Nov 1, 2023

clang/test/SemaSYCL/reqd-sub-group-size-amd_64.cpp (outdated)

PR fixes

7076bdd

jchlanda requested a review from elizabethandrews November 1, 2023 13:52

elizabethandrews reviewed Nov 1, 2023

clang/include/clang/Basic/DiagnosticSemaKinds.td (outdated)

Merge remote-tracking branch 'upstream/sycl' into jakub/amd_sub_group_warning

23892e1

jchlanda requested a review from elizabethandrews November 6, 2023 14:32

elizabethandrews reviewed Nov 6, 2023

clang/test/SemaSYCL/reqd-sub-group-size-amd_32.cpp

Early return and merge warning

e9e31ee

jchlanda added 2 commits November 7, 2023 09:20

Add CodeGen test to make sure that attributes are indeed ignored

0a72b7b

Merge remote-tracking branch 'upstream/sycl' into jakub/amd_sub_group_warning

5d0f2aa

elizabethandrews approved these changes Nov 8, 2023

smanna12 approved these changes Nov 8, 2023

PR feedback

82f3c5e

jchlanda requested a review from AlexeySachkov November 21, 2023 13:12

againull merged commit 6bce7f6 into intel:sycl Nov 21, 2023

jchlanda mentioned this pull request Nov 23, 2023

[SYCL] Provide a mechanism to silence incorrect sub group size warning #11991

Merged
